Abstract


Classical  or  ordinary  least  squares  (OLS)  is  one  of  the  most  commonly 
used  criteria  for  fitting  data  to  models  and  for  estimating  parameters.  This 
is  true  even  when  a  key  assumption  for  its  use,  namely  that  the  independent 
variables  are  known  exactly,  is  violated.  Orthogonal  distance  regression 
(ODR)  extends  least  squares  data  fitting  to  problems  with  independent 
variables that are not known exactly. Theoretical analysis, however, shows
OLS is preferable to ODR for straight line functions under certain conditions,
even when there are measurement errors in the independent variable.
This has led some to conjecture that under some similar conditions OLS
will also be preferable to ODR for nonlinear functions even though there
are errors in the independent variable.

In this paper, we present the results of an empirical study designed to
examine whether ODR provides better results than OLS when there are errors
in the independent variable. We examine a variety of functions, both
linear and nonlinear, under a variety of experimental conditions. The results
indicate that, for the data and performance criteria considered, ODR
never performs appreciably worse than OLS and sometimes performs considerably
better. This leads us to the conclusion that ODR is appropriate
for a wide variety of practical problems.
1 Introduction


Of all of the criteria used for fitting data to models or for estimating
parameters, classical or ordinary least squares (OLS) is the one most commonly
used. This continues to be the case even when the assumptions required
to fully justify its use are not completely met. Orthogonal distance
regression, or ODR, is designed to extend least squares data fitting to a class of
problems which violate a key assumption for the use of OLS in parameter
estimation, namely that the independent variables, $x_i$, are known exactly.
In this paper, we compare the performance of OLS with that of ODR when
this key assumption for OLS is violated.

The data fitting problem arises by considering a data set $(x_i, y_i)$,
$i = 1, \ldots, n$, that has been collected and a model that is purported to
explain the relationship of $y_i \in \mathbb{R}^1$ to $x_i \in \mathbb{R}^m$.
Specifically, if we assume there is no error in $(x_i, y_i)$ and the true or
actual value of the parameter vector $\beta^a \in \mathbb{R}^p$ is known, then

$$y_i = f(x_i; \beta^a)$$

where $f$ is a smooth function that can be either linear or nonlinear in $x_i$ and
$\beta$. Alternatively, if we suppose that the observations $y_i$ contain actual, but
unknown, additive errors $\epsilon_i^a \in \mathbb{R}^1$, and that the observations
$x_i$ are known exactly, then $y_i$ satisfies

$$y_i = f(x_i; \beta^a) - \epsilon_i^a, \quad i = 1, \ldots, n. \tag{1.1}$$

Finally, if we allow there to be additive errors in both $x_i$ and $y_i$, then the
data satisfy

$$y_i = f(x_i + \delta_i^a; \beta^a) - \epsilon_i^a, \quad i = 1, \ldots, n, \tag{1.2}$$

where $\delta_i^a \in \mathbb{R}^m$ is the actual, but unknown, additive error in $x_i$.
(Note that we have chosen the signs of $\epsilon_i$ and $\delta_i$ for convenience and
consistency with other work.)

Using OLS, $\beta^a$ is approximated by finding the $\beta^{OLS}$ for which the sum of
the squares of the $n$ vertical distances from the curve $f(x_i; \beta)$ to the $n$ data
points is minimized. This is accomplished by the minimization problem

$$\min_{\beta} \sum_{i=1}^{n} w_i^2 \left( f(x_i; \beta) - y_i \right)^2 \tag{1.3}$$

where $w_i$, $i = 1, \ldots, n$, are non-negative numbers that allow the procedure
to be applied to problems when the observations should be weighted differently.
When (1.1) is satisfied and $\epsilon = (\epsilon_1, \ldots, \epsilon_n)^T \sim N(0, \sigma^2 I)$,
then (1.3) with each $w_i = 1$ results in the maximum likelihood estimator of $\beta^a$.

Since (1.3) assigns all errors to $y_i$, a critical assumption of OLS in
parameter estimation is that there are no errors in $x_i$. When this assumption
is violated, use of OLS does not appear to be fully justified, and may not
produce good estimates [Ful87, Mor71].

ODR, on the other hand, does allow for errors in $x_i$. ODR approximates
$\beta^a$ by finding that $\beta$ for which the sum of the squares of the $n$ weighted
orthogonal distances from the curve $f(x_i; \beta)$ to the $n$ data points is
minimized. The estimated parameters, $\beta^{ODR}$, are then those values that solve
the minimization problem

$$\min_{\beta, \delta} \sum_{i=1}^{n} w_i^2 \left[ \left( f(x_i + \delta_i; \beta) - y_i \right)^2 + \delta_i^T \rho_i^2 \delta_i \right], \tag{1.4}$$

where $\rho_i \in \mathbb{R}^{m \times m}$, $i = 1, \ldots, n$, is a set of positive diagonal
matrices that allow $\epsilon_i$ and $\delta_i$ to have different variances [BogBS85].
When (1.2) is satisfied and $\epsilon, \delta_1, \ldots, \delta_n$ are independent and
distributed as $\epsilon \sim N(0, \sigma_\epsilon^2 I)$ and
$\delta_i \sim N(0, \sigma_\epsilon^2 \rho_i^{-2})$, then (1.4) with each $w_i = 1$ results in
the maximum likelihood estimator of $\beta^a$. In the most common use of ODR, it is
assumed that each $\rho_i = \rho I$, where $\rho$ is the ratio of the standard deviations
of the errors in the $y$ and $x$ data, i.e., $\rho = \sigma_\epsilon / \sigma_\delta$.
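For the special case of a straight line with scalar x, unit weights, and a single scalar ratio of error standard deviations, the ODR problem has a well-known closed form (Deming regression with variance ratio equal to the square of that ratio). The sketch below is illustrative only: the function names and the small data set are hypothetical, and the study itself uses ODRPACK rather than this formula.

```python
import math

def _centered_sums(x, y):
    """Return the means and the centered sums of squares and products."""
    n = len(x)
    xbar = sum(x) / n
    ybar = sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    syy = sum((yi - ybar) ** 2 for yi in y)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    return xbar, ybar, sxx, syy, sxy

def fit_line_ols(x, y):
    """OLS fit of y = b1*x + b2 with unit weights (all error assigned to y)."""
    xbar, ybar, sxx, _, sxy = _centered_sums(x, y)
    b1 = sxy / sxx
    return b1, ybar - b1 * xbar

def fit_line_odr(x, y, rho):
    """Weighted orthogonal distance fit of y = b1*x + b2 for scalar rho.

    Assumes unit weights and sxy != 0. The slope is the closed-form
    Deming solution with variance ratio lam = rho**2.
    """
    xbar, ybar, sxx, syy, sxy = _centered_sums(x, y)
    lam = rho ** 2
    d = syy - lam * sxx
    root = math.sqrt(d * d + 4.0 * lam * sxy * sxy)
    # Two algebraically equal forms; use the one that avoids cancellation.
    if d >= 0:
        b1 = (d + root) / (2.0 * sxy)
    else:
        b1 = 2.0 * lam * sxy / (root - d)
    return b1, ybar - b1 * xbar

# Hypothetical data, for illustration only.
x = [0.0, 1.0, 2.0, 3.0, 4.0]
y = [0.1, 0.9, 2.2, 2.8, 4.1]
ols_slope, ols_intercept = fit_line_ols(x, y)
odr_slope, odr_intercept = fit_line_odr(x, y, rho=1.0)
```

As rho grows, the ODR slope approaches the OLS slope, which matches the remark that the two methods coincide when there is no error in x.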

In this paper, we present the results of an empirical study designed to
examine whether ODR provides better results than OLS when there are
errors in both $x_i$ and $y_i$. We examine a variety of functions, including
functions nonlinear in $x$ and $\beta$, under a variety of experimental conditions.
While the statistical properties of the estimators from ODR fits are not yet
well understood, [ReiGL86, Ful87] show that there are theoretical reasons
to prefer OLS to ODR for a straight line function in certain situations even
though there are errors in the observations $x_i$. This has led some to
conjecture that under some similar conditions we should also ignore the errors
in $x_i$ when fitting models which are nonlinear in $\beta$ or $x$. The results of our
study indicate that this is probably not the case for the measures we have
chosen. Specifically, for the data and performance criteria considered, ODR
never performs appreciably worse than OLS and sometimes performs
considerably better. This leads us to the conclusion that ODR is appropriate
in a wide variety of practical problems.

To our knowledge, this is the first extensive computational study of the
errors in variable problem with nonlinear functions. Previous work [e.g.,
Ful87, Mor71] has mainly concentrated on analytical analysis of the straight
line function. We believe that this is partially due to the fact that, until now,
the ODR problem has been relatively expensive to solve and the necessary
software has not been readily available. [BogBS85], however, presents a
trust-region, Levenberg-Marquardt algorithm that exploits the structure of
the ODR problem to obtain a procedure that is both stable and efficient.
The order of operations per iteration, and the constant for the highest order
term, are the same for the algorithm developed in [BogBS85] as for a
trust-region, Levenberg-Marquardt solution of the OLS problem, namely $O(np^2)$
operations per iteration. (A straightforward use of an OLS algorithm
on (1.4) would require $O(n(n+p)^2)$ operations per iteration [BogBS85],
which is clearly prohibitive for large values of $n$.) The algorithm described
in [BogBS85] has been implemented in the portable Fortran subroutine
library ODRPACK [BogBDS87]. The availability of ODRPACK makes it
reasonable to conduct the study reported here.

We  emphasize  that  this  study  is  only  a  first  step  and  that  we  have 
left  many  important  questions  unanswered.  Some  of  these  are  discussed 
in  §2  where  we  detail  the  motivation  for  our  study  and  its  scope.  We 
outline  our  Monte  Carlo  procedure  in  §3,  and  in  §4  we  summarize  our 
observations  and  present  our  conclusions.  Our  plans  for  future  work  are 
given  in  §5.  A  detailed  description  of  our  results  and  the  accompanying 
figures  are  presented  in  the  Appendix. 

2  Motivation 

While ODRPACK provides an efficient means of solving ODR problems, it is
not known whether there is a “theoretical” penalty for using ODR since the
theoretical analysis of ODR is not yet available for functions other than a
straight line.

For a straight line, [ReiGL86] notes that OLS results in a smaller mean
square error of the slope than ODR when


where $\beta$ is the slope of the line, $\rho = \sigma_\epsilon / \sigma_\delta$, and it is
assumed that $\epsilon_i$ and $\delta_i$ are independent. [BogBS85], on the other hand,
presents empirical results for which ODR appears preferable to OLS. We are
therefore interested in studying the question

Under what conditions is ODR preferable to OLS, and, conversely,
when is OLS preferable to ODR?

This  paper  is  a  first  approximation  to  answering  this  question. 

The question actually has two parts. First, can we detect a practical
difference  in  performance  between  ODR  and  OLS?  Second,  assuming  that 
differences  in  performance  are  detected,  can  we  characterize  the  conditions 
under  which  such  differences  occur  in  order  to  predict  when  one  method 
will  be  preferable  to  the  other? 

To detect a difference between ODR and OLS we must select a measure
of performance. As a first step, we have chosen to investigate the estimated
parameter values, $\beta$, and function values, $f(x_i; \beta)$, since these are commonly
of interest and easily understood. For both we use three standard measures
of performance: bias, variance and mean square error.
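The three measures can be computed from a collection of Monte Carlo estimates of a single parameter as in the following minimal sketch. The function name is hypothetical, and dividing by the number of estimates (rather than one less) is an assumption made here so that the mean square error equals the variance plus the squared bias exactly.

```python
def summarize_estimates(estimates, true_value):
    """Return (bias, variance, mse) of estimates about a known true value.

    Assumption: the variance uses divisor n, so mse == variance + bias**2.
    """
    n = len(estimates)
    mean = sum(estimates) / n
    bias = mean - true_value
    variance = sum((e - mean) ** 2 for e in estimates) / n
    mse = sum((e - true_value) ** 2 for e in estimates) / n
    return bias, variance, mse

# Hypothetical estimates of a parameter whose true value is 1.5.
bias, variance, mse = summarize_estimates([1.0, 2.0, 3.0], true_value=1.5)
```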

To determine performance predictors that can be used to characterize
a priori whether a data set should be solved using ODR or OLS, we
re-examine the results for a straight line function. The ODR solution for
a straight line can be derived by noting that the square of the weighted
orthogonal distance between the line $\beta_1 x + \beta_2$ and the data point $(x_i, y_i)$ is

$$\frac{(\beta_1 x_i + \beta_2 - y_i)^2}{1 + (\beta_1 / \rho)^2}.$$
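The squared weighted orthogonal distance for a straight line can be worked out by minimizing over the error in $x$. The sketch below assumes a scalar $x$, unit weight, and a scalar $\rho$; the shorthand $r_i$ and $\delta^*$ is introduced here only for the derivation.

$$r_i = \beta_1 x_i + \beta_2 - y_i, \qquad \min_{\delta} \left[ (r_i + \beta_1 \delta)^2 + \rho^2 \delta^2 \right].$$

Setting the derivative with respect to $\delta$ to zero gives

$$\beta_1 (r_i + \beta_1 \delta^*) + \rho^2 \delta^* = 0 \quad \Longrightarrow \quad \delta^* = -\frac{\beta_1 r_i}{\beta_1^2 + \rho^2},$$

and substituting $\delta^*$ back yields

$$(r_i + \beta_1 \delta^*)^2 + \rho^2 (\delta^*)^2 = \frac{\rho^2 r_i^2}{\beta_1^2 + \rho^2} = \frac{r_i^2}{1 + (\beta_1 / \rho)^2}.$$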


If we assume that any function, whether linear or nonlinear in $x$ and $\beta$,
is at least approximately a straight line in the neighborhood about each
individual point $(x_i, y_i)$, then the square of the weighted orthogonal distance
between $f(x; \beta)$ and $(x_i, y_i)$ is

$$\theta_i(\beta) = \frac{(f(x_i; \beta) - y_i)^2}{1 + \left( \dfrac{\partial f(x_i; \beta) / \partial x}{\rho} \right)^2},$$

meaning that the ODR problem (1.4) can be approximated by

$$\min_{\beta} \sum_{i=1}^{n} w_i^2 \theta_i(\beta). \tag{2.1}$$

As the ratios

$$h(x_i; \beta) = \frac{\partial f(x_i; \beta) / \partial x}{\rho}$$

approach 0, (2.1) becomes equivalent to the OLS problem (1.3). The sizes
of the ratios $h(x_i; \beta)$ should thus be related to the question of when ODR
is different from OLS.

Unfortunately, when $f(x_i; \beta)$ is not linear in $x_i$, $\partial f(x_i; \beta)/\partial x$ and
therefore $h(x_i; \beta)$ vary with $x_i$. Consequently, in order to assess a single
number as a performance predictor of ODR in relation to OLS we must map
$\partial f(x_1; \beta)/\partial x, \ldots, \partial f(x_n; \beta)/\partial x$ into a single value. For simplicity, we
initially choose

$$Q = \max\{ |\partial f(x_i; \beta)/\partial x|, \; i = 1, \ldots, n \},$$

i.e., the $L_\infty$ norm of $[\partial f(x_1; \beta)/\partial x, \ldots, \partial f(x_n; \beta)/\partial x]^T$, as the mapping and

$$Q/\rho$$

as the performance predictor. We note that other norms could be chosen.
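The predictor can be evaluated directly once the derivative of the model with respect to x is available. The following sketch uses analytic derivatives for an exponential model and a straight line on 51 equally spaced points in [-1, 1]; the function names and the value of rho are hypothetical choices made for illustration.

```python
import math

def max_abs_slope(dfdx, beta, xs):
    """Q: the largest absolute value of df/dx over the design points xs."""
    return max(abs(dfdx(x, beta)) for x in xs)

def dfdx_exponential(x, beta):
    """Derivative with respect to x of beta[0] * exp(beta[1] * x)."""
    return beta[0] * beta[1] * math.exp(beta[1] * x)

def dfdx_line(x, beta):
    """Derivative with respect to x of beta[0] * x + beta[1]."""
    return beta[0]

xs = [-1.0 + 2.0 * i / 50 for i in range(51)]
q_exp = max_abs_slope(dfdx_exponential, (0.4, 1.0), xs)
q_line = max_abs_slope(dfdx_line, (10.0, -2.0), xs)

rho = 2.0  # hypothetical ratio of error standard deviations
predictor = q_exp / rho
```

When no analytic derivative is at hand, a finite-difference approximation of the slope at each point could be substituted for `dfdx`.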

Our approach for this initial pilot study, described in detail in §3, is
relatively simple. Briefly, we select seven functions that, although clearly not
exhaustive, are ubiquitous in science and engineering, and seven different
values of $\rho$ that are used with each function. For each function, we also
choose two parameter sets that produce different values of $Q$. Treating $\rho$ as
known, we then examine (a) how the performance of ODR and OLS varies
for an individual function as $\rho$ changes, (b) how the results vary between
functions and (c) how well the performance prediction value $Q/\rho$ forecasts
the observed results.

We recognize that we are examining our data under ideal conditions.
Clearly, $\rho$ and $Q$ will frequently not be known exactly, but this does not
seem to be bothersome. In our experience, $Q$ can be reasonably estimated
for most functions and data sets. Furthermore, if $\beta(\rho)$ is the estimate of
$\beta^a$ for a given value of $\rho$, we can show that when $\delta$ is small, then
$\partial \beta(\rho) / \partial \rho$ is also small. Thus, the value of $\beta$ should not change much
as $\rho$ is varied. This is observed in other experience not reported in this paper.
For the remainder of this paper, therefore, we assume that the true value of
$\rho$ is known, but see §5.

3  Procedure 

In  this  section,  we  briefly  describe  the  details  of  our  Monte  Carlo  study. 


Our  study  examines  seven  functional  forms. 

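$$f_1(x_i; \beta) = \beta_1 x_i + \beta_2 \tag{3.1}$$

$$f_2(x_i; \beta) = \beta_1 x_i^2 + \beta_2 \tag{3.2}$$

$$f_3(x_i; \beta) = \beta_1 x_i^2 + \beta_2 x_i + \beta_3 \tag{3.3}$$

$$f_4(x_i; \beta) = \beta_1 \exp(\beta_2 x_i) \tag{3.4}$$

$$f_5(x_i; \beta) = \beta_1 \exp(\beta_2 x_i) + \beta_3 \tag{3.5}$$

$$f_6(x_i; \beta) = \beta_1 \sin(\beta_2 x_i + 2) \tag{3.6}$$

$$f_7(x_i; \beta) = \beta_1 \sin(\beta_2 x_i + \beta_3) \tag{3.7}$$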


We  have  selected  two  sets  of  parameter  values  for  each  function.  For  both. 

naax{|y“|  =  |/(x,;  0“)|.  t  *  «  1. 

In addition, the first parameter set is chosen so that $Q \approx 1$ and the
second so that $Q \approx 10$. The data sets constructed for each function using
the first parameter set thus have some similar attributes, as do the
data sets constructed using the second parameter set, even though
the functional forms are different. The parameter sets are as follows.

Function   Parameter Set 1                    Parameter Set 2

$f_1$      $\beta^a = (1.1, 0.9)^T$           $\beta^a = (10.0, -2.0)^T$

$f_2$      $\beta^a = (0.3, 3.3)^T$           $\beta^a = (4.5, -3.0)^T$

$f_3$      $\beta^a = (0.3, 0.4, 3.3)^T$      $\beta^a = (4.5, 1.0, -4.5)^T$

$f_4$      $\beta^a = (0.4, 1.0)^T$           $\beta^a = (1.2, 1.6)^T$

$f_5$      $\beta^a = (0.4, 1.0, -1.0)^T$     $\beta^a = (1.2, 1.6, -5.0)^T$

$f_6$      $\beta^a = (0.9, 1.1)^T$           $\beta^a = (5.0, 2.0)^T$

$f_7$      $\beta^a = (0.9, 1.1, 2.0)^T$      $\beta^a = (5.0, 2.0, 1.0)^T$



For all functions and parameter sets, the number of observations, $n$, is
51, and $x_i^a$, $i = 1, \ldots, n$, are the 51 equally spaced values over the interval
$[-1, 1]$. The ranges of $x^a$ and $y^a$ are thus comparable.

We  analyze  each  of  the  seven  functions  using  both  sets  of  parameters 
and  seven  different  values  of  p  =  ot/<Ti u  namely  p  =  1,1.2.10,100,  and 

$\infty$. When $\rho \neq \infty$, we analyze the data using both methods, ODR and OLS.
(When $\rho = \infty$ the two methods are equivalent.) A total of 182 combinations
of function, parameter set, $\rho$, and method are considered.
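The count of 182 follows from seven functions, two parameter sets, six finite ratios analyzed by both methods, and one infinite ratio for which the two methods coincide. A small enumeration makes this explicit; the labels below are placeholders introduced for illustration and do not encode the actual ratio values.

```python
from itertools import product

functions = [f"f{k}" for k in range(1, 8)]        # seven functional forms
parameter_sets = [1, 2]                            # two parameter sets each
finite_ratios = [f"rho_{k}" for k in range(1, 7)]  # placeholder labels
methods = ["ODR", "OLS"]

# Finite ratios are analyzed with both methods.
combos = list(product(functions, parameter_sets, finite_ratios, methods))
# With an infinite ratio the two methods are equivalent, so count it once.
combos += [(f, s, "inf", "ODR=OLS") for f, s in product(functions, parameter_sets)]

n_combinations = len(combos)
# Two groups of 500 data sets are generated for every combination.
n_problems = n_combinations * 500 * 2
```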

We generate two groups of 500 data sets each for this collection of 182
combinations. In the first group, a single set of values $\tilde\epsilon$ and $\tilde\delta$ is used
to produce the actual errors $\epsilon_i^a$ and $\delta_i^a$ for each of the problems within the
collection. The problems analyzed within the collection of 182 combinations
in the first group, therefore, are not independent. In the second group,
a different set of errors $\tilde\epsilon$ and $\tilde\delta$ is generated for each of the problems
within the collection. The problems analyzed within the collection of 182
combinations in the second group are independent. The first group allows
us to make pair-wise comparisons of the individual results obtained using
ODR and OLS, and enables us to ensure that variations in performance are
not artificially induced by variation in the data used within the collection.
The independent data used for the second group open our analysis to a
wider range of statistical tests.

For each of the 500 data sets in both groups, we generate the $n$ i.i.d.
pseudo random values, $\tilde\epsilon_i$, $i = 1, \ldots, n$, from a normal distribution with
mean 0 and standard deviation .05, and the $n$ values $\tilde\delta_i$, $i = 1, \ldots, n$, also
i.i.d. normally distributed with mean 0 and standard deviation .05. This
set of values $\tilde\epsilon$ and $\tilde\delta$ is used to produce the actual errors $\epsilon^a$ and $\delta^a$, where


-'(*)' 


-ferr)' 


for each of the seven values of $\rho$. (The expected sum of the squared errors
is thus constant over the seven different values of $\rho$.) These errors $\epsilon_i^a(\rho)$
and $\delta_i^a(\rho)$ are then used to generate the “observations” $y_i(\rho)$ and $x_i(\rho)$ for
a given value of $\rho$, where

$$y_i(\rho) = y_i^a - \epsilon_i^a(\rho) = f(x_i^a; \beta^a) - \epsilon_i^a(\rho)$$

and

$$x_i(\rho) = x_i^a + \delta_i^a(\rho).$$
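The generation of one data set can be sketched as follows. The scaling of the base errors is an assumption made for illustration: the multipliers are chosen so that their ratio equals rho and the sum of their squares stays constant, which is one way to keep the expected sum of squared errors fixed. Python's built-in generator stands in for the Marsaglia and Tsang algorithm, and all function names are hypothetical.

```python
import math
import random

def error_scales(rho):
    """Multipliers (c_eps, c_delta) with c_eps / c_delta = rho.

    ASSUMPTION: normalized so c_eps**2 + c_delta**2 = 2 for every rho;
    this particular normalization is illustrative, not the paper's.
    """
    if math.isinf(rho):
        return math.sqrt(2.0), 0.0
    c_delta = math.sqrt(2.0 / (1.0 + rho * rho))
    return rho * c_delta, c_delta

def make_data_set(f, beta, rho, n=51, sd=0.05, rng=None):
    """Return (x_true, x_obs, y_obs) for model f with true parameters beta."""
    rng = rng if rng is not None else random.Random(0)
    x_true = [-1.0 + 2.0 * i / (n - 1) for i in range(n)]
    eps_base = [rng.gauss(0.0, sd) for _ in range(n)]    # base errors in y
    delta_base = [rng.gauss(0.0, sd) for _ in range(n)]  # base errors in x
    c_eps, c_delta = error_scales(rho)
    y_obs = [f(xa, beta) - c_eps * e for xa, e in zip(x_true, eps_base)]
    x_obs = [xa + c_delta * d for xa, d in zip(x_true, delta_base)]
    return x_true, x_obs, y_obs

def line(x, beta):
    """Straight line model beta[0] * x + beta[1]."""
    return beta[0] * x + beta[1]

x_true, x_obs, y_obs = make_data_set(line, (1.1, 0.9), rho=2.0)
```

Reusing the same seeded generator across ratios mimics the first group of data sets, where one set of base errors is shared; drawing fresh values per combination mimics the second group.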

The errors are produced using the Marsaglia and Tsang [MarT84]
pseudo-normal random number algorithm as implemented by James Blue and
David Kahaner of the Scientific Computing Division of the National
Bureau of Standards. The ODR and OLS estimators are computed using
ODRPACK. Parameters are initialized to $\beta^a$ for both the ODR and OLS
solutions, and for the ODR solutions the errors in $x_i$ are initialized to $\delta^a$
and $\rho$ is set to the correct value. Using $\beta^a$ and $\delta^a$ for starting values is
reasonable since in this study we are only interested in the properties of the
ODR and OLS solutions and not in the properties of the estimation
procedures used to obtain them. The graphics package TEMPLATE [MegSC] is
used to produce the plots.

All computations are performed in single precision on the CDC Cyber
205 at the National Bureau of Standards. Approximately 3400 seconds cpu
time are required to solve the 182000 optimization problems. There are 31
trials for which one of the 182 problems failed to converge in 200 iterations.
Each of these “failed” trials is omitted from the analysis, and another trial
substituted in its place. Such a small percentage of failures does not affect
our conclusions.

4  Conclusions 

Our  study  addresses  two  main  issues: 

1. the relationship between the performance of ODR and OLS for
parameter and function estimation as determined by the three measures,
bias, variance and mean square error (mse); and,

2. to a lesser extent, how the performance characteristics of ODR and
OLS vary for different values of $Q$ and $\rho$.


The first item is important for determining whether there is a preferred
method. The second may be important in determining how to choose
between the two methods. In this section, we summarize our observations
and conclusions regarding these two issues. A more detailed description of
the results from our study is given in the Appendix.

Our results indicate that ODR should always be used when our criteria
are relevant. A subroutine library such as ODRPACK is just as easy to use
as an OLS subroutine library and is no more computationally expensive
per iteration. More importantly, however, for all of the measures of
performance examined in our study, ODR is seldom seriously worse than OLS
and is frequently significantly better, especially for $\rho < 2$. We conclude
that, except for outliers, ODR results in smaller bias, variance, and mse for
both parameter and function estimates than does OLS.

• Our results for the bias of the parameter and function estimates are
especially clear.

- OLS is statistically better (as described in the Appendix) only
2% of the time.

- ODR, on the other hand, is appreciably better more than 50% of
the time, and the largest of the relative differences between the
ODR and OLS biases when the ODR bias is closer to 0 is more
than 250 times the largest of the relative differences observed
when the OLS bias is closer to 0.

• Our results for the variance of the parameter and function estimates
also decisively favor ODR over OLS.

- For both the variances of the parameter estimates and the
variances of the function estimates, the ODR variance is appreciably
smaller than the OLS variance more than 23% of the time, and
in over 10% of the cases the ODR variance is less than 50% of
the OLS variance.

- Conversely, OLS results in appreciably smaller variance than
ODR only 2% of the time.


The  2%  of  the  time  that  OLS  has  appreciably  smaller  variance  than 
ODR  all  occur  for  the  same  two  data  sets  involving  the  sine  function 
(3.6)  and  (3.7)  when  p  —  ^ .  The  results  for  these  two  data  sets,  shown 
in  figures  4  and  5,  are  clearly  different  from  any  of  the  other  data  sets 
we  examined  in  that  they  contain  significant  outliers.  Further  analysis 
will  be  required  to  explain  their  anomalous  behavior. 

•  Our  results  for  the  mse  of  the  parameter  and  function  estimates  are 
essentially  the  same  as  those  observed  for  the  variance: 

- ODR results in appreciably smaller mse approximately 25% of
the time with the ODR mse less than 50% of the OLS mse
approximately 20% of the time, while

-  OLS  results  in  appreciably  smaller  mse  only  2%  of  the  time. 

Again,  the  times  OLS  has  smaller  mse  are  all  observed  for  the  two 
data  sets  that  affected  the  variance  in  the  analogous  manner  and  will 
require  further  study. 

The 500 data sets used in our study are apparently not enough to confirm
the theoretical results reported in [ReiGL86] that indicate OLS should be
preferable under certain conditions when the function is a straight line.
OLS  does,  in  fact,  produce  a  smaller  variance  and  mse  than  ODR  for  our 
linear  data  sets.  One  would  seldom,  if  ever,  call  the  difference  statistically 
or  practically  significant,  however.  This,  coupled  with  the  bias  data  that 
indicates  that  ODR  is  significantly  better  for  a  straight  line,  leads  us  to 
conclude  that  ODR  is  the  method  of  choice  for  our  criteria. 

Our final conclusion, based on a visual examination of the almost 200
plots generated for this study, is that $Q/\rho$ does not adequately predict the
relationship between the performance of ODR and OLS, although as a crude
measure it does have its merits. Since our primary interest originally was
to predict when one should prefer ODR to OLS, or vice versa, and since we
conclude now that we always prefer ODR to OLS when there are errors in
$x$ and $\rho$ is known, the failure of this performance predictor is not important
to our current results. Further analysis of the predictors mentioned in §2
may be required, however, when we examine other performance criteria, or
when we examine the effect of not knowing $\rho$ exactly.


5  Future  Work 


As  we  have  mentioned  before,  this  is  a  pilot  study,  and  as  one  would  expect 
from  such  a  study,  we  have  answered  some  questions  and  raised  others. 

There are clearly additional questions to be answered from our current
data. The anomalous results we noted for functions (3.6) and (3.7)
definitely require further analysis. In addition, we would like to examine
other measures of performance, and the structure of the estimated
residuals. Further analysis may also raise the need for different values to help
predict whether to use ODR or OLS as described in §2.

Additional studies need to be performed. Our current plans include
an examination of the effect of using an incorrect value of $\rho$. We also
plan to examine other functions with much steeper slopes than what we
allowed in this study, and with unequally spaced $x$ data. Finally, we note
that least squares estimates, whether OLS or ODR, are not robust in the
presence of outliers [Ful87], and that [GolV83] shows that ODR problems
are more ill conditioned than the corresponding OLS problem. We would
therefore like to experiment with diagnostics and resampling techniques
such as bootstrapping that could be used to indicate when the ODR results
might be affected by ill conditioning or outliers.
A Results and Observations


In this section, we present a detailed description of the results and
observations that support the conclusions presented in §4.

Because of the large amount of data examined, our analysis is primarily
graphical. We have included in this paper only representative examples of
the almost 200 plots generated for this study. As noted in §3, our study
includes two groups of 500 data sets: the first with errors $\epsilon^a$ and $\delta^a$ within
the collection of 182 combinations of function, parameter set, $\rho$, and method
that are not independent; and the second with errors $\epsilon^a$ and $\delta^a$ within
the collection of 182 combinations that are independent. All plots shown
here are from the first group. The text describes additional observations
derived from the second group. The full graphical analysis of both groups
is available in [BogDSS87].

A.1 Parameter Estimates

A.1.1 Bias

Results. To determine how close our estimated parameter values, $\hat\beta$,
come to the actual or true values, $\beta^a$, we examine the bias of the estimated
parameters, i.e., the values

$$\hat\beta_j - \beta_j^a$$

where $\hat\beta_j$ designates the estimated value of the $j$-th parameter for a given
problem and data set, and $\beta_j^a$ is the corresponding true value of the
parameter.

For each of the two groups of data sets, we display the 500 parameter
biases obtained for each of the parameters in the collection of 182
combinations of function, parameter set, $\rho$, and method. We then examine each
of the resultant pairs of bias estimates obtained using the two methods.

Figures 1, 2 and 3 show these results for each of the parameters and
values of $\rho$ for three representative combinations of function and
parameter set. These three figures correspond to three functions of increasing
complexity. Figure 1 shows function (3.1) parameter set 2, a straight line
function with slope of 10. Figure 2 shows function (3.3) parameter set 1,
a quadratic function in $x$ with maximum slope 1 for $x \in [-1, 1]$. Figure 3
shows the results for function (3.5) parameter set 2, an exponential
function that is nonlinear in both $x$ and $\beta$ and that has maximum slope 10 for
$x \in [-1, 1]$.

Figures  4  and  5  show  the  bias  results  for  the  sine  functions  (3.6)  and 
(3.7),  respectively,  both  using  parameter  set  2,  and  are  analogous  to  figures 
1,  2  and  3.  These  two  figures  are  not  typical  in  that  they  show  a  small 
number  of  outliers  when  p  =  no  outliers  are  observed  in  any  of  the 
other  parameter  bias  plots.  These  outliers  appear  to  adversely  affect  the 
corresponding  variance  and  mean  square  error  estimates,  as  discussed  in 
the  following  sections,  as  well  as  the  results  for  the  function  estimates  for 
these  data  sets. 

Each column of icons in these figures represents a modified
box-and-whisker plot [Tuk77]:

o designates the median,

+ designates the quartiles, and

o designates the maximum and minimum.

The  remaining  bias  values  from  each  of  the  500  data  sets  are  designated  by 
a  dot  (•).  The  values  are  grouped  by  p,  then  by  parameter  and  finally  by 
method.  Thus,  the  first  column  of  icons  on  each  of  these  figures  displays  the 
bias  observed  using  ODR  for  0\  when  p  =  the  second  column  displays 
the  bias  using  OLS  for  0\  when  p  =  etc. 

If the median parameter bias is determined to be different from 0 at the
.05 significance level using a two-sided sign test, the median is “flagged”
with a check ($\surd$) plotted above the corresponding icon for the maximum
value.  (See.  e.g..  figure  1,  P  =  $>,  &  estimated  using  OLS.)  If  it  is  not 
different  at  the  .05  significance  level,  no  flag  is  shown. 
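A two-sided sign test of the hypothesis that the median bias is 0 can be written with exact binomial tail probabilities. The sketch below is a minimal illustration: the function names and sample values are hypothetical, and dropping exact zeros is a common convention assumed here rather than something the study specifies.

```python
from math import comb

def sign_test_p_value(values):
    """Two-sided sign test p-value for H0: the median of values is 0.

    Assumption: observations exactly equal to 0 are dropped.
    """
    positives = sum(1 for v in values if v > 0)
    negatives = sum(1 for v in values if v < 0)
    n = positives + negatives
    if n == 0:
        return 1.0
    k = min(positives, negatives)
    # P(X <= k) for X ~ Binomial(n, 1/2), doubled for the two-sided test.
    lower_tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2.0 * lower_tail)

def flag_median(values, alpha=0.05):
    """True when the median would be flagged as different from 0."""
    return sign_test_p_value(values) < alpha

# Hypothetical biases: nine positive and one negative.
biases = [0.02, 0.03, 0.01, 0.04, 0.02, 0.05, 0.01, 0.03, 0.02, -0.01]
p_value = sign_test_p_value(biases)
flagged = flag_median(biases)
```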

Observations.  The  parameter  bias  results  are  essentially  the  same  for 
both  groups  of  data. 

• In more than 33% of the ODR/OLS pairs, the sign test indicates that
the median ODR bias is 0 (i.e., that the hypothesis that the median
ODR bias is equal to 0 would not be rejected at the .05 significance
level) when the median OLS bias is not 0.


•  In  fewer  than  2%  of  the  ODR/OLS  pairs,  the  sign  test  indicates  the 
ODR  median  is  not  0  when  the  OLS  median  is  0. 

• In approximately 25% of the ODR/OLS pairs, the sign test indicates
that both medians are not 0. Of these cases, when it is possible to
visually detect a difference in the medians of these pairs, the median
ODR bias is almost always closer to zero.

For  the  largest  of  the  differences  between  the  median  ODR  and  OLS 
biases,  the  median  ODR  bias  is  always  closer  to  0  than  the  median  OLS 
bias.  The  largest  differences  occur  for  values  of  p  <  1,  with  the  differences 
increasing  as  p  decreases.  As  noted  in  the  next  section,  the  variance  of  the 
ODR  results  is  generally  as  small  or  smaller  than  that  of  the  OLS  results. 

A.1.2 Sample Variance

Results. The sample variance, $\hat\sigma_j^2$, of the parameter estimate, $\hat\beta_j$, is a
measure of the variability of the estimate about its average value. To
examine the relationship between the variance of the ODR parameter
estimates and variance of the OLS parameter estimates, we plot the base 10
logarithm of the ratios of the sample variances for each of the estimated
parameters, i.e.,

$$\log_{10} \left( \hat\sigma_j^2(\mathrm{ODR}) / \hat\sigma_j^2(\mathrm{OLS}) \right).$$

These  plots  allow  us  to  examine  the  relationship  between  the  individual 
variance  pairs  as  well  as  how  the  relationship  changes  as  a  function  of  Q 
and  p.  All  resultant  variance  ratios  are  examined. 

Figures 6, 7 and 8 are representative examples of the variance plots,
and show the variance ratios for the data shown in figures 1, 2 and 3,
respectively. The icon used for each ratio is the number of the subscript of
$\beta_j$, i.e., the ratio of the variances of $\beta_1$ is plotted using the symbol “1,” the
ratio of the variances of $\beta_2$ is plotted using the symbol “2,” etc. Note that
the columns of icons in these figures are in the same order with respect to
$\rho$ as the “Parameter Bias” plots.

For  the  first  group  of  data,  the  errors,  and  therefore  the  variances,  are 
not  independent.  We  thus  use  a  two-sided  Pitman  Nearness  Test  [Rao86] 
to  make  a  pair-wise  comparison  of  the  500  deviations 

\[ |\hat{\beta}_j - \mathrm{median}\{\hat{\beta}_j\}| \]

obtained  for  each  parameter  of  each  function  and  parameter  set  using  each 
of  the  values  of  p.  Note  that  we  expect  the  dependence  introduced  by  the 
sample  median  to  be  small.  If,  using  this  test,  we  reject  the  hypothesis  that 
the  deviations  from  the  two  methods  are  the  same  at  the  .05  significance 
level, we “flag” the appropriate icon with an asterisk (*). On these variance
plots, we also indicate the magnitude of the ratios with the two lines marked
“20 PERCENT.” An icon falling above the upper of these two lines indicates
the ODR variance is more than 20% bigger than the OLS variance. An
icon  falling  below  the  lower  of  these  two  lines  indicates  the  OLS  variance 
is  more  than  20%  bigger  than  the  ODR  variance. 
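
The exact form of the two-sided Pitman Nearness Test of [Rao86] is not reproduced in this appendix. One simple way to carry out a paired comparison of this kind, assumed here purely for illustration, is a sign-type test: count how often the ODR deviation is smaller than the paired OLS deviation and compare that count with a Binomial(n, 1/2) distribution. The function names below are hypothetical.

```python
import math
import statistics


def abs_deviations_from_median(estimates):
    """Absolute deviations of the estimates from their sample median."""
    med = statistics.median(estimates)
    return [abs(e - med) for e in estimates]


def two_sided_binomial_p(k, n):
    """Exact two-sided p-value for k successes in n fair-coin trials."""
    tail = min(k, n - k)
    p_tail = sum(math.comb(n, i) for i in range(tail + 1)) / 2 ** n
    return min(1.0, 2.0 * p_tail)


def nearness_sign_test(dev_odr, dev_ols, alpha=0.05):
    """Sign-type comparison of paired deviations (an assumed stand-in).

    Returns (times ODR is nearer, untied pairs, p-value, flagged).
    """
    odr_nearer = sum(a < b for a, b in zip(dev_odr, dev_ols))
    ols_nearer = sum(a > b for a, b in zip(dev_odr, dev_ols))
    n = odr_nearer + ols_nearer  # tied pairs are dropped
    p = two_sided_binomial_p(odr_nearer, n) if n else 1.0
    return odr_nearer, n, p, p < alpha


# Made-up paired deviations: ODR is nearer in every one of 10 pairs.
result = nearness_sign_test([1.0] * 10, [2.0] * 10)
print(result)
```

With 500 pairs per comparison, as in this study, the same functions apply unchanged because the binomial tail is computed with exact integer arithmetic.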

For  the  second  group  of  data,  the  errors,  and  therefore  the  variances, 
are  independent.  We  are  thus  able  to  test  whether  each  variance  ratio  is 
different  from  1  at  the  .05  significance  level  using  a  two-sided  F-test. 
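
A two-sided F-test of this kind can be sketched as follows. The sketch assumes 500 estimates per method (the count quoted above for the first group of data), so that each sample variance has 499 degrees of freedom, and it evaluates the F distribution with a hand-written regularized incomplete beta function so that it needs only the standard library. All names are illustrative; a statistics library would normally supply the F distribution.

```python
import math


def _betacf(a, b, x, max_iter=300, eps=1e-12):
    """Continued fraction for the incomplete beta function (modified Lentz)."""
    tiny = 1e-300
    qab, qap, qam = a + b, a + 1.0, a - 1.0
    c = 1.0
    d = 1.0 - qab * x / qap
    d = 1.0 / (d if abs(d) > tiny else tiny)
    h = d
    for m in range(1, max_iter + 1):
        m2 = 2 * m
        # The even step and the odd step of the recurrence.
        for aa in (m * (b - m) * x / ((qam + m2) * (a + m2)),
                   -(a + m) * (qab + m) * x / ((a + m2) * (qap + m2))):
            d = 1.0 + aa * d
            d = 1.0 / (d if abs(d) > tiny else tiny)
            c = 1.0 + aa / c
            c = c if abs(c) > tiny else tiny
            delta = d * c
            h *= delta
        if abs(delta - 1.0) < eps:
            break
    return h


def regularized_incomplete_beta(a, b, x):
    if x <= 0.0:
        return 0.0
    if x >= 1.0:
        return 1.0
    log_front = (math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
                 + a * math.log(x) + b * math.log1p(-x))
    front = math.exp(log_front)
    if x < (a + 1.0) / (a + b + 2.0):
        return front * _betacf(a, b, x) / a
    return 1.0 - front * _betacf(b, a, 1.0 - x) / b


def f_cdf(f_value, df1, df2):
    """CDF of the F distribution with (df1, df2) degrees of freedom."""
    x = df1 * f_value / (df1 * f_value + df2)
    return regularized_incomplete_beta(df1 / 2.0, df2 / 2.0, x)


def two_sided_f_test(variance_ratio, n1, n2, alpha=0.05):
    """Test whether a ratio of two sample variances differs from 1."""
    cdf = f_cdf(variance_ratio, n1 - 1, n2 - 1)
    p = min(1.0, 2.0 * min(cdf, 1.0 - cdf))
    return p, p < alpha


print(two_sided_f_test(1.05, 500, 500))  # small difference in variances
print(two_sided_f_test(2.0, 500, 500))   # one variance twice the other
```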

Observations.  In  the  first  group  of  data,  the  two-sided  Pitman  Nearness 
Test  indicates  the  deviations  from  the  ODR  fit  are  different  than  those 
obtained  from  the  OLS  fit  101  times  out  of  the  resultant  204  pairs. 

•  In  83  of  the  101  cases,  the  ODR  variance  is  smaller  than  the  OLS 
variance.  These  83  cases  include  all  62  cases  where  the  OLS  variance 
is more than 20% larger than the ODR variance, and the 41 times
that the OLS variance is more than twice the ODR variance.

•  In  18  of  the  101  cases,  the  OLS  variance  is  smaller  than  the  ODR 
variance.  These  18  cases  include  only  2  of  the  5  times  that  the  ODR 
variance  is  more  than  20%  larger  than  the  OLS  variance,  and  they 
include neither of the 2 times that the ODR variance is more than
twice the OLS variance. (Each of the 5 cases where the ODR variance
is more than 20% larger than the OLS variance occurs for the sine
functions  (3.6)  and  (3.7)  using  parameter  set  2  at  p  —  ^.) 

For  the  second  group  of  data,  the  F-test  indicates  that  the  variances 
using  ODR  and  OLS  are  different  at  the  .05  significance  level  69  times. 

•  In  64  of  these  69  cases,  the  ODR  variance  is  significantly  smaller  than 
the  OLS  variance,  and  in  36  of  the  64  cases  the  ODR  variance  is  less 
than 50% of the OLS variance.

•  In  5  of  these  69  cases,  the  OLS  variance  is  significantly  smaller  than 
the ODR variance, and the OLS variance is less than 50% of the ODR
variance  in  2  of  these  5.  Both  of  these  2  cases  occurred  in  the  sine 
functions  results  using  parameter  set  2  at  p  =  The  bias  results  for 
these  two  functions,  like  the  results  shown  in  figures  4  and  5  for  the 
first  group  of  data,  include  a  small  number  of  outliers  which  appear 
to  be  responsible  for  the  increased  ODR  variance.  Such  outliers  were 
not  observed  in  any  of  the  other  results,  including  functions  (3.6)  and 
(3.7)  using  parameter  set  2  when  p>\- 

For both the first and second data groups, the largest differences between
the variances occur for the smaller values of p, and, with the exception
of the sine function data using parameter set 2 at p =  when the difference
between the ODR and OLS variances is large, the ODR variance is
the smaller of the pair.


A.1.3 Mean Square Error


Results. The mean square errors (mse) of the estimated parameters are
measures of the variability of the estimate $\hat{\beta}_j$ about its true value, $\beta_j^0$. We
examine the relationship between the mse observed using ODR and OLS
in the same manner that we analyze the sample variance of the parameter
estimates. We plot the base 10 logarithm of the ratios of the mse of the
estimated parameters, i.e.,

\[ \log_{10} \left( \frac{\sum (\hat{\beta}_j^{\mathrm{ODR}} - \beta_j^0)^2}{\sum (\hat{\beta}_j^{\mathrm{OLS}} - \beta_j^0)^2} \right) \]

against

-log 

All  resultant  mse  ratios  are  examined. 
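
To make the definition concrete, the sketch below computes the mse of a set of estimates about a known true value, forms the base 10 logarithm of the ODR/OLS mse ratio, and checks the standard identity that the mse equals the squared bias plus the variance (taken with an n denominator). The estimates and the true value are made up for illustration and are not results from this study.

```python
import math
import statistics


def mean_square_error(estimates, true_value):
    """Average squared distance of the estimates from the true value."""
    return sum((b - true_value) ** 2 for b in estimates) / len(estimates)


def log10_mse_ratio(odr_estimates, ols_estimates, true_value):
    """Base-10 log of (ODR mse) / (OLS mse)."""
    return math.log10(mean_square_error(odr_estimates, true_value)
                      / mean_square_error(ols_estimates, true_value))


# Made-up estimates of a parameter whose true value is taken to be 1.0.
true_beta = 1.0
odr = [0.9, 1.1, 1.0, 1.2]
ols = [1.1, 1.3, 1.2, 1.4]

log_ratio = log10_mse_ratio(odr, ols, true_beta)

# mse = bias^2 + variance, using the n-denominator (population) variance.
bias_ols = statistics.fmean(ols) - true_beta
decomposed = bias_ols ** 2 + statistics.pvariance(ols)
print(log_ratio, mean_square_error(ols, true_beta), decomposed)
```

The decomposition shows why the mse plots combine the information in the bias and variance plots: an estimator can have a small variance and still a large mse if it is biased.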

Figures 9, 10 and 11, corresponding to the data shown in figures 1, 2
and 3, respectively, are representative examples of these plots. Like the
sample variance plots discussed in §A.1.2, the icon used for each ratio is
the number of the subscript of $\beta$. For both groups of data, we indicate
the magnitude of the ratios with the two lines marked “20 PERCENT.” For
the first group of data, we also use a two-sided Pitman Nearness Test to
make a comparison of the 500 values of $|\hat{\beta}_j - \beta_j^0|$ observed for each of the
parameters of each of the functions at each of the values of p. If, using
this test, we reject the hypothesis that the magnitudes of the biases for the
two methods are the same at the .05 significance level, we “flag” the
appropriate icon with an asterisk (*).

Observations. For the first group of data, with errors that are not independent,
the Pitman Nearness Test indicates that the deviations from the
ODR  fit  are  different  than  those  obtained  from  the  OLS  fit  119  times  out 
of  the  204  resultant  ODR/OLS  pairs. 

•  In  100  of  the  119  cases,  the  ODR  mse  is  smaller  than  the  OLS  mse. 
These 100 include most of the 91 ratios for which the OLS mse is
more than 20% larger than the ODR mse, and all of the 53 cases
for  which  the  observed  ODR  mse  is  less  than  50%  of  the  OLS  mse. 

• In 19 of the 119 cases, the OLS mse is smaller than the ODR mse,
including all 3 cases (each resulting from the sine functions with parameter
set 2 at p = ^) for which the ODR mse is more than
20% larger than the OLS mse.

For the second group of data, with errors that are independent, we have
not  performed  any  statistical  test  of  significance. 

•  We  observe,  however,  that  for  90  of  the  204  resultant  ratios  the  OLS 
mse is more than 20% bigger than the ODR mse, and in 54 of these
the  OLS  mse  is  more  than  twice  the  ODR  mse. 


•  Also,  the  ODR  mse  is  more  than  20%  larger  than  the  OLS  mse  in 
only  4  of  the  204  resultant  ratios,  and  the  ODR  mse  is  more  than 
twice  the  OLS  mse  in  only  2  of  these.  (Both  of  the  two  ratios  for 
which  the  ODR  mse  is  twice  the  OLS  mse  again  result  from  the  sine 
functions  using  parameter  set  2  for  p  —  -k.) 

Like  the  variance  results,  for  both  groups  of  data  the  largest  differences 
occur  for  the  smaller  values  of  p.  Also,  for  the  largest  differences,  the  ODR 
mse  is  almost  always  the  smaller  of  the  two  mse. 

A.2  Function  Estimates 

A.2.1 Bias

Results. To determine how close the function estimates, $f(x_i; \hat{\beta})$, come to
the actual or true values, $f(x_i; \beta^0)$, we examine the biases of the function
estimates, i.e., the values

\[ f(x_i; \hat{\beta}) - f(x_i; \beta^0) \]

where $\hat{\beta}$ designates the estimated value of the parameters for a given problem
and data set. These are computed for 3 representative values of $x_i$ over
the interval [-1, 1], namely, x = -1, 0 and 1. The data for each of the two
groups of data sets are examined using modified box-and-whisker plots as
were  the  parameter  bias  data.  All  resultant  pairs  of  function  estimates  are 
examined. 
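
The function bias at the three representative points can be sketched as follows. The model used here is a hypothetical two-parameter exponential chosen only to make the example runnable; it is not one of the functions studied in this report, and the parameter values are invented.

```python
import math


def f(x, beta):
    # Hypothetical model, assumed for illustration only;
    # NOT one of the report's test functions.
    return beta[0] * math.exp(beta[1] * x)


def function_bias(beta_hat, beta_true, x_points=(-1.0, 0.0, 1.0)):
    """f(x; beta_hat) - f(x; beta_true) at each representative x."""
    return [f(x, beta_hat) - f(x, beta_true) for x in x_points]


# Made-up estimate and true value of the parameter vector.
biases = function_bias(beta_hat=(1.1, 0.5), beta_true=(1.0, 0.5))
print(biases)
```

Repeating this calculation for every data set gives, at each of the three points, the collection of biases that the box-and-whisker plots summarize.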

Figures 12, 13 and 14 show representative examples of these plots, and
present results for the same data sets as those analyzed in figures 1, 2
and 3. These figures are completely analogous to the parameter bias plots
discussed in §A.1.1. Again, we test whether the median bias of the function
estimate is different from 0 at the .05 significance level using a two-sided
sign test, indicating medians that are not 0 according to this test with a
check (✓).
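
A two-sided sign test of the hypothesis that the median bias is 0 can be sketched as below: under that hypothesis, positive and negative biases are equally likely, so the number of positive biases follows a Binomial(n, 1/2) distribution. The function name and the sample values are illustrative assumptions.

```python
import math


def sign_test_median_zero(biases, alpha=0.05):
    """Two-sided sign test of H0: median = 0.

    Returns (p_value, flagged); flagged corresponds to a check mark.
    """
    positives = sum(b > 0 for b in biases)
    negatives = sum(b < 0 for b in biases)
    n = positives + negatives  # exact zeros carry no sign and are dropped
    if n == 0:
        return 1.0, False
    tail = min(positives, negatives)
    p_tail = sum(math.comb(n, i) for i in range(tail + 1)) / 2 ** n
    p_value = min(1.0, 2.0 * p_tail)
    return p_value, p_value < alpha


# Made-up biases: 9 positive and 1 negative.
print(sign_test_median_zero([0.1] * 9 + [-0.1]))
# Made-up biases: evenly split about 0.
print(sign_test_median_zero([0.1] * 5 + [-0.1] * 5))
```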

Observations.  The  bias  results  for  the  function  estimates  are  almost 
exactly  the  same  for  both  data  groups. 


• In more than 40% of the ODR/OLS pairs, the sign test indicates that
the  median  ODR  function  bias  is  0  while  the  median  OLS  function 
bias  is  not  0. 

•  In  fewer  than  2%  of  the  ODR/OLS  pairs,  the  sign  test  indicates  that 
the  median  OLS  function  bias  is  0  while  the  median  ODR  function 
bias  is  not  0. 

• In the approximately 20% of the ODR/OLS pairs in which both medians
are not 0, the ODR median is almost always closer to 0 than the OLS
median,  and  sometimes  appreciably  so. 

For the largest of the differences between the medians, the median ODR
function bias is always closer to 0 than the corresponding median OLS function
bias. As was true for the parameter bias results, the largest differences
occur for values of p < 1, with the differences increasing as p decreases.
Also, as noted in the next section, the variance of the ODR results is
generally as small as or smaller than the corresponding OLS variance.


A.2.2 Sample Variance

Results. The sample variance, $\sigma^2_{f(x_i;\hat{\beta})}$, is a measure of the variability of
the function estimate $f(x_i; \hat{\beta})$ about its average value. To examine the
relationship between the variance of the ODR function estimates and the
variance of the OLS function estimates, we plot

\[ \log_{10} \left( \frac{\sigma^2_{f(x_i;\hat{\beta}^{\mathrm{ODR}})}}{\sigma^2_{f(x_i;\hat{\beta}^{\mathrm{OLS}})}} \right) \]

against


Q 

—  log  — 

[p\ 


for each of the function estimates observed at the three selected values of
$x_i$. All resultant variance ratios are examined.

Figures 15, 16 and 17 are representative examples of these variance
plots, and show the variance ratios corresponding to the function bias data
shown in figures 12, 13 and 14, respectively. The format is analogous to
that  used  for  the  parameter  variance  plots,  discussed  in  §A.1.2.  For  the 
first  data  group,  in  which  the  errors  are  not  independent,  we  test  whether 
the  deviations 


\[ |f(x_i; \hat{\beta}^{\mathrm{ODR}}) - \mathrm{median}\{f(x_i; \hat{\beta}^{\mathrm{ODR}})\}| \]

are different from

\[ |f(x_i; \hat{\beta}^{\mathrm{OLS}}) - \mathrm{median}\{f(x_i; \hat{\beta}^{\mathrm{OLS}})\}| \]

at a .05 significance level using a two-sided Pitman Nearness Test, “flagging”
with an asterisk (*) the ratios for which the absolute values of the
deviations  are  found  to  differ  at  this  significance  level.  For  the  second 
group,  in  which  the  errors  are  independent,  we  test  whether  each  variance 
ratio  is  different  from  1  at  the  .05  significance  level  using  a  two-sided  F-test. 


Observations.  In  the  first  group  of  data,  the  above  mentioned  two-sided 
Pitman  Nearness  Test  indicates  that  the  deviations  obtained  using  ODR 
are  different  from  those  obtained  using  OLS  at  the  .05  significance  level  117 
times  out  of  252. 


•  In  84  of  the  117  cases,  the  ODR  variance  is  smaller  than  the  OLS 
variance.  These  84  cases  include  59  of  the  68  cases  where  the  OLS 
variance is more than 20% larger than the ODR variance, and all of
the 32 cases where the OLS variance is more than twice the ODR
variance.


•  In  33  of  the  117  cases,  the  OLS  variance  is  smaller  than  the  ODR 
variance.  These  33  cases  include  4  of  the  5  cases  that  the  ODR 
variance is more than 20% larger than the OLS variance, and 3 of the
4 cases where the ODR variance is more than twice the OLS variance.
(The 5 cases that the ODR variance is more than 20% larger than
the OLS variance all occur for the sine functions using parameter set
2 when p =


In  the  second  group  of  data,  the  F-test  indicates  that  the  variances 
using  ODR  and  OLS  are  different  at  the  .05  significance  level  71  times  out 
of the 252 resultant ratios.

•  In  63  of  the  71  cases,  the  ODR  variance  is  significantly  smaller  than 
the  OLS  variance,  and  in  28  of  these  63  cases  the  ODR  variance  is 
less than 50% of the OLS variance.

•  In  8  of  the  71  cases,  the  OLS  variance  is  significantly  smaller  than  the 
ODR variance, and the OLS variance is less than 50% of the ODR
variance  in  2  of  these  8.  Both  of  these  2  again  result  from  the  sine 
functions  using  parameter  set  2  at  p  = 

For  both  groups,  the  largest  differences  between  the  variances  occur  for 
the smaller values of p, and, with the exception of the sine function results
for  parameter  set  2  when  p  =  when  the  differences  between  the  ODR 
and  OLS  variances  are  large,  the  ODR  variance  is  the  smaller  of  the  two. 

A.2.3  Mean  Square  Error 

Results. The mse of the function estimate is a measure of the variability
of the estimate $f(x_i; \hat{\beta})$ about its true value, $f(x_i; \beta^0)$. We examine the
relationship between the mse of the function estimate observed using ODR
and OLS in the same manner that we analyze the mse of the parameter
estimates, namely, we plot

\[ \log_{10} \left( \frac{\sum \left( f(x_i; \hat{\beta}^{\mathrm{ODR}}) - f(x_i; \beta^0) \right)^2}{\sum \left( f(x_i; \hat{\beta}^{\mathrm{OLS}}) - f(x_i; \beta^0) \right)^2} \right) \]

against 

-log 

All  resultant  mse  ratios  are  examined. 

Figures 18, 19 and 20, corresponding to the data shown in figures 12, 13
and 14, respectively, are representative examples of these plots. The format
for these plots, and the analysis performed, is analogous to that described
in §A.1.3.

Observations. For the first group of data, with errors that are not independent,
the Pitman Nearness Test indicates that the deviations from the
ODR fit are different than those obtained from the OLS fit 157 times out of the
252 resultant ODR/OLS pairs.

•  In  132  of  the  157  cases,  the  ODR  mse  is  smaller  than  the  OLS  mse. 
These  132  include  all  of  the  111  ratios  when  the  OLS  mse  is  more 
than 20% larger than the ODR mse, and all of the 64 ratios when the
OLS  mse  is  more  than  twice  the  ODR  mse. 

• In 25 of the 157 cases, the OLS mse is smaller than the ODR mse,
including all 4 ratios for which the ODR mse is more than 20% larger than the
OLS mse, and the 3 ratios for which the ODR mse is more than twice
the  OLS  mse.  (All  4  of  these  cases  again  occur  for  the  sine  functions 
using  parameter  set  2  when  p  =  ^.) 

For  the  second  group  of  data,  with  errors  that  are  independent,  we  have 
not  performed  any  statistical  test  of  significance. 

•  We  observe,  however,  that  in  112  of  the  resultant  252  ratios,  the  OLS 
mse is more than 20% larger than the corresponding ODR mse, and
in  59  cases  the  OLS  mse  is  more  than  twice  the  corresponding  ODR 
mse. 

•  Also,  we  observe  that  the  ODR  mse  is  more  than  20%  larger  than  the 
OLS  mse  in  only  3  cases,  and  is  more  than  twice  the  corresponding 
OLS  mse  in  only  2  cases. 

The  mse  results  are  thus  similar  to  those  we  observed  for  the  other 
performance  measures  in  that  both  groups  of  data  show  that  the  differences 
in  the  mse  increase  as  p  decreases.  Again,  for  the  larger  differences,  with 
the  exception  of  the  outlier  data  in  the  sine  function  results,  the  ODR  mse 
is  always  smaller  than  the  OLS  mse. 


[Figures 2, 3 and 13: plots not reproduced. Surviving plot title: “Ratio of Sample Variances of Parameter Estimates.”]